ggraph
One of the defining principles of Social Network Analysis is that it draws heavily on graphic imagery. Hence, to perform such analyses, it is crucial to learn how to depict the data properly. For this tutorial, I will, again, use the dataset containing Congress members’ agreement on abortion bills. First, I need to read in the data and create the tbl_graph.
library(tidyverse)
library(tidygraph)
library(ggraph)
abortion_nodes <- read_csv("data/abortion-bills/nodelist_abortion_bills.csv",
                           col_types = cols(
                             id = col_character(), # node names need to be characters
                             female = col_integer(),
                             democrat = col_integer())) %>%
  mutate(party = case_when(democrat == 1 ~ "Democrat",
                           democrat == 0 ~ "Republican"),
         party = as_factor(party),
         gender = case_when(female == 1 ~ "female",
                            female == 0 ~ "male"),
         gender = as_factor(gender)) %>%
  select(-democrat, -female)
abortion_edges <- read_csv("data/abortion-bills/edgelist_abortion_bills.csv",
                           col_types = cols(
                             from = col_character(), # node names need to be characters
                             to = col_character(), # node names need to be characters
                             weight = col_double()
                           )) %>%
  group_by(from, to) %>%
  summarize(weight = sum(weight)) %>%
  ungroup() %>% # summarize() leaves the data grouped by `from`; drop that so later verbs see the whole table
  mutate(weight = as.integer(floor(weight))) %>%
  filter(weight > 0)
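# random draw of 1,000 edges; call set.seed() beforehand if you need a reproducible sample
sample_edges <- abortion_edges %>%
  slice(sample(nrow(.), 1000))
# keep only the nodes that appear in at least one sampled edge
sample_nodes <- abortion_nodes %>%
  filter(id %in% sample_edges$from | id %in% sample_edges$to)
abortion_graph <- tbl_graph(nodes = sample_nodes,
                            edges = sample_edges)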
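To make it clearer what tbl_graph() does with character node names, here is a minimal sketch on made-up data. The objects toy_nodes, toy_edges, and toy_graph are hypothetical and not part of the abortion-bills dataset. The sketch also sets node_key = "id" explicitly, which is my assumption about the cleanest way to say which node column the from and to values should be matched against.

```r
library(tibble)
library(tidygraph)

# Hypothetical toy data, not taken from the abortion-bills dataset
toy_nodes <- tibble(
  id    = c("a", "b", "c"),
  party = c("Democrat", "Republican", "Democrat")
)
toy_edges <- tibble(
  from   = c("a", "a", "b"),
  to     = c("b", "c", "c"),
  weight = c(3L, 1L, 2L)
)

# node_key names the node column that character from/to values are matched against
toy_graph <- tbl_graph(nodes = toy_nodes, edges = toy_edges, node_key = "id")

# Printing shows both the node table and the edge table
toy_graph

# In the stored edge table, from/to are integer row indices into the node table
toy_edge_tbl <- toy_graph %>%
  activate(edges) %>%
  as_tibble()
toy_edge_tbl
```

Printing the graph like this is also a quick, non-graphical way to check that all nodes and edges arrived as expected.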
Now that the tbl_graph is created, you might want to get a quick overview of it. You can achieve this using autograph():
autograph(abortion_graph)
In this case, the network is far too dense for autograph() to display it in a meaningful way. I could probably tweak some of autograph()'s arguments, but building the plot with ggraph directly is likely easier, so that is what I will do.
Some of you may be familiar with the “layered grammar of graphics” (see, for instance, Wickham (2010) for more information). ggplot2, probably the most popular R package for visualizing data, makes good use of it. In a nutshell, every plot consists of layers, and every layer adds a new feature. The following example builds up our graph step by step: it uses a nice layout (“stress”), colors the nodes according to their party affiliation (Democrats blue, Republicans red), and maps the edge alpha (the edges’ transparency) to the edge weights:
ggraph(abortion_graph) +
  labs(caption = "Basic plot")
## Using `stress` as default layout

ggraph(abortion_graph) +
  geom_edge_link0(aes(edge_alpha = weight), edge_color = "grey66") +
  labs(caption = "Basic plot + edges")
## Using `stress` as default layout

ggraph(abortion_graph) +
  geom_edge_link0(aes(edge_alpha = weight), edge_color = "grey66") +
  geom_node_point(aes(fill = party), shape = 21) +
  labs(caption = "Basic plot + edges + nodes")
## Using `stress` as default layout

ggraph(abortion_graph) +
  geom_edge_link0(aes(edge_alpha = weight), edge_color = "grey66") +
  geom_node_point(aes(fill = party), shape = 21) +
  scale_fill_manual(values = c(Republican = "red", Democrat = "blue")) +
  labs(caption = "Basic plot + edges + nodes + colors aligned to party")
## Using `stress` as default layout
So, when plotting networks with ggraph, you need to provide at least three things: the layout you want to use, what you want to do with the edges, and what you want to do with the nodes.
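The three ingredients can be seen in isolation on a small toy graph. The following is only a sketch: toy_graph and its party attribute are made up for illustration, and storing the plot in p to count its layers is just one way to show that the edge and node geoms each contribute one layer.

```r
library(dplyr)
library(tidygraph)
library(ggraph)

# Hypothetical toy graph: a ring of six nodes with a made-up party attribute
toy_graph <- create_ring(6) %>%
  mutate(party = rep(c("Democrat", "Republican"), times = 3))

p <- ggraph(toy_graph, layout = "stress") +                   # 1. the layout
  geom_edge_link0(edge_color = "grey66") +                    # 2. the edges
  geom_node_point(aes(fill = party), shape = 21, size = 4) +  # 3. the nodes
  scale_fill_manual(values = c(Republican = "red", Democrat = "blue"))

# The layout is not a layer; each geom adds one layer
n_layers <- length(p$layers)
n_layers

p
```

Dropping either geom still gives a valid plot; it simply leaves out the edges or the nodes.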
A layout determines where the nodes are placed on the x and y axes. The positions are computed by a layout algorithm, and which one to choose depends largely on the graph you want to depict. The default, layout = "stress", produces fairly nice results for most graphs, so you should not worry too much about it. You usually provide the layout in the initial ggraph() call.
ggraph(abortion_graph, layout = "fr") +
  geom_edge_link0(aes(edge_alpha = weight), edge_color = "grey66") +
  geom_node_point(aes(fill = party), shape = 21) +
  labs(caption = "Basic plot + edges + nodes")
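If you want to look at what a layout algorithm actually computes, ggraph's create_layout() returns the node positions as a data frame with x and y columns. The sketch below uses a hypothetical six-node ring (toy_graph) rather than the abortion graph, and the seed value is arbitrary; it is only there because "fr" starts from random positions.

```r
library(tidygraph)
library(ggraph)

# Hypothetical toy graph, used only to keep the output small
toy_graph <- create_ring(6)

set.seed(42)  # arbitrary seed so the "fr" coordinates are reproducible
layout_fr     <- create_layout(toy_graph, layout = "fr")
layout_stress <- create_layout(toy_graph, layout = "stress")

# One row per node; x and y are the coordinates ggraph would plot
cbind(x = layout_fr$x, y = layout_fr$y)
cbind(x = layout_stress$x, y = layout_stress$y)
```

Comparing the two coordinate tables shows that only the node positions differ between layouts; the graph itself is unchanged.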
The GIF here shows a couple of layouts. It was created by Thomas Lin Pedersen, the author of the ggraph package.